<html>
<head>
<title>Foobar File Format</title>
</head>
<body>
<h3>File Format</h3>
<p>The first three lines of the file form a header, and each has a fixed meaning.</p>

<p>The first line contains the following values: numberOfWords vowels ignoreDiacriticals</p>

<p>These give, in order, the number of words in the language, the vowels used in constructing the language, and whether or not the language ignores diacritical marks.</p>
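<p>A minimal sketch of reading the first line is shown here. The format above does not say how the fields are delimited or how ignoreDiacriticals is written, so this sketch assumes whitespace-separated fields, vowels given as a single token with no spaces, and a flag written as 1/0 or true/false. The names Header and parse_header are hypothetical.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass
class Header:
    number_of_words: int
    vowels: str
    ignore_diacriticals: bool


def parse_header(line: str) -> Header:
    """Parse the first line: numberOfWords vowels ignoreDiacriticals."""
    # Assumption: the three fields are separated by whitespace.
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 header fields, got {len(fields)}")
    count, vowels, flag = fields

    # Assumption: the flag is written as 1/0 or true/false (any case).
    truthy = {"1", "true"}
    falsy = {"0", "false"}
    lowered = flag.lower()
    if lowered not in truthy | falsy:
        raise ValueError(f"unrecognised ignoreDiacriticals value: {flag!r}")

    return Header(int(count), vowels, lowered in truthy)
```
</pre>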

<p>The second line contains a list of numbers separated by spaces. The numbers are read in pairs, and describe the lengths of words in the language. The first number of each pair is the length, and the second is the corresponding number of words with that length.</p>
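<p>The pairing described above could be read as follows. This is a sketch only: parse_length_counts is a hypothetical name, and summing the counts when a length appears more than once is an assumption, since the format does not say whether that can happen.</p>
<pre>
```python
def parse_length_counts(line: str) -> dict[int, int]:
    """Parse the second line into a mapping of word length to word count."""
    numbers = [int(token) for token in line.split()]
    if len(numbers) % 2 != 0:
        raise ValueError("the length line must hold an even number of values")

    counts: dict[int, int] = {}
    # Walk the list two values at a time: (length, number of words).
    for i in range(0, len(numbers), 2):
        length, count = numbers[i], numbers[i + 1]
        # Assumption: a repeated length adds to the existing count.
        counts[length] = counts.get(length, 0) + count
    return counts
```
</pre>
<p>For example, the line "3 10 4 25" would be read as 10 words of length 3 and 25 words of length 4.</p>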

<p>The third line of the file contains a space-separated list of the words that have already been processed for the language. This prevents duplicates when additional word lists are merged in.</p>
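<p>One way the third line might be used during a merge is sketched below. The function names are hypothetical, and comparing words exactly as written (with no case or diacritical folding) is an assumption.</p>
<pre>
```python
def parse_processed_words(line: str) -> set[str]:
    """Parse the third line into the set of already-processed words."""
    return set(line.split())


def new_words_only(processed: set[str], incoming: list[str]) -> list[str]:
    """Return the incoming words not yet processed, recording each as seen."""
    fresh = []
    for word in incoming:
        # Assumption: words are compared exactly as written.
        if word not in processed:
            processed.add(word)
            fresh.append(word)
    return fresh
```
</pre>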

<p>Lines after the first three can either describe a cluster of characters (a cluster definition line) or describe the characters that can follow a cluster of characters (a succession definition line). All clusters in a succession definition line must have had a cluster definition line earlier in the file. The easiest way to ensure this is to put all cluster definition lines before all succession definition lines, but this is not enforced.</p>
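<p>Because the ordering is not enforced, a reader or writer of the file may want to check it. The sketch below walks the lines after the header and reports every cluster named in a succession definition line before its cluster definition line has been seen. It relies on the "CD:" and "SD:" line prefixes that this format defines; the function name and the choice to report zero-based line indexes are assumptions.</p>
<pre>
```python
def find_undefined_clusters(body_lines: list[str]) -> list[tuple[int, str]]:
    """Return (line index, cluster) for each cluster used before definition."""
    defined: set[str] = set()
    problems: list[tuple[int, str]] = []
    for index, line in enumerate(body_lines):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "CD:":
            # The cluster's characters are the first field after the prefix.
            defined.update(fields[1:2])
        elif fields[0] == "SD:":
            # Both the leading cluster and its successors must be defined.
            for cluster in fields[1:]:
                if cluster not in defined:
                    problems.append((index, cluster))
    return problems
```
</pre>
<p>An empty result means every succession definition line refers only to clusters defined earlier in the file.</p>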

<p>Cluster definition lines have the format: "CD: &lt;characters&gt; &lt;type&gt; &lt;startFrequency&gt; &lt;middleFrequency&gt; &lt;endFrequency&gt;"</p>

<p>characters is the sequence of characters that makes up the cluster. type is 0 if the cluster is a vowel cluster and 1 if it is a consonant cluster. startFrequency, middleFrequency, and endFrequency are the number of times the cluster appears at the start, in the middle, and at the end of words, respectively.</p>
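<p>A cluster definition line could be parsed roughly as follows. This sketch assumes the quotation marks shown around the format are not part of the line, that fields are separated by whitespace, and that a cluster contains no spaces. ClusterDefinition and parse_cluster_definition are hypothetical names.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass
class ClusterDefinition:
    characters: str
    is_vowel: bool
    start_frequency: int
    middle_frequency: int
    end_frequency: int


def parse_cluster_definition(line: str) -> ClusterDefinition:
    """Parse a line of the form: CD: characters type start middle end."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "CD:":
        raise ValueError(f"not a cluster definition line: {line!r}")
    _, characters, kind, start, middle, end = fields

    # type is 0 for a vowel cluster and 1 for a consonant cluster.
    if kind not in ("0", "1"):
        raise ValueError(f"cluster type must be 0 or 1, got {kind!r}")

    return ClusterDefinition(
        characters, kind == "0", int(start), int(middle), int(end)
    )
```
</pre>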

<p>Succession definition lines have the format: "SD: &lt;characters&gt; &lt;succeedingCharacters&gt; &lt;succeedingCharacters&gt; (etc.)"</p>

<p>characters is the cluster that this succession definition line describes. Each succeedingCharacters entry is the characters of one cluster that can follow that cluster.</p>
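<p>A succession definition line can be split in a similar way. As a sketch, assuming whitespace-separated fields and no literal quotation marks in the line, the hypothetical function below returns the cluster and the list of clusters that follow it. Whether a line with no succeeding clusters is valid is not stated in the format, so this sketch accepts it and returns an empty list.</p>
<pre>
```python
def parse_succession_definition(line: str) -> tuple[str, list[str]]:
    """Parse a line of the form: SD: characters successor successor ..."""
    fields = line.split()
    # The line needs the SD: prefix and at least the cluster itself.
    if fields[:1] != ["SD:"] or len(fields) == 1:
        raise ValueError(f"not a succession definition line: {line!r}")
    # Assumption: zero successors is allowed and yields an empty list.
    return fields[1], fields[2:]
```
</pre>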

</body>
</html>